Uncertainty propagation for noise robust speaker recognition: the case of NIST-SRE

Authors

  • Dayana Ribas González
  • Emmanuel Vincent
  • José Ramón Calvo de Lara
Abstract

Uncertainty propagation is an established approach to handle noisy and reverberant conditions in automatic speech recognition (ASR), but it has been little studied for speaker recognition so far. Yu et al. recently proposed to propagate uncertainty to the Baum-Welch (BW) statistics without changing the posterior probability of each mixture component. They obtained good results on a small dataset (YOHO) but little improvement on the NIST-SRE dataset, despite the use of oracle uncertainty estimates. In this paper, we propose to modify the computation of the posterior probability of each mixture component in order to obtain unbiased BW statistics. We show that our approach improves the accuracy of BW statistics on the Wall Street Journal (WSJ) corpus, but again yields little or no improvement on NIST-SRE. We provide a theoretical explanation for this that opens the way for more efficient exploitation of uncertainty on NIST-SRE and other large datasets in the future.

Similar resources

UTD-CRSS Systems for 2012 NIST Speaker Recognition Evaluation The CRSS SRE Team

This document briefly describes the systems submitted by the Center for Robust Speech Systems (CRSS) from The University of Texas at Dallas (UTD) for the 2012 NIST Speaker Recognition Evaluation. We developed a state-of-the-art i-vector based speaker recognition system [1]. Probabilistic linear discriminant analysis (PLDA) [2] along with several other backends are used for channel/noise compens...

The QUT-NOISE-SRE protocol for the evaluation of noisy speaker recognition

The QUT-NOISE-SRE protocol is designed to mix the large QUT-NOISE database, consisting of over 10 hours of background noise, collected across 10 unique locations covering 5 common noise scenarios, with commonly used speaker recognition datasets such as Switchboard, Mixer and the speaker recognition evaluation (SRE) datasets provided by NIST. By allowing common, clean, speech corpora to be mixed...

Session variability compensation in speaker and language recognition

This report summarises the research work performed by the author in order to start his Ph.D Thesis which is based on robust automatic speaker and language recognition. One of the main causes of errors in automatic speaker and language recognition systems is due to intrinsic variability between sessions of a same speaker. This variability known as session or channel variability is caused by seve...

Discriminative subspace modeling of SNR and duration variabilities for robust speaker verification

Although i-vectors together with probabilistic LDA (PLDA) have achieved a great success in speaker verification, how to suppress the undesirable effects caused by the variability in utterance length and background noise level is still a challenge. This paper aims to improve the robustness of i-vector based speaker verification systems by compensating for the utterance-length variability and noi...

Comparison of Voice Activity Detectors for Interview Speech in NIST Speaker Recognition Evaluation

Interview speech has become an important part of the NIST Speaker Recognition Evaluations (SREs). Unlike telephone speech, interview speech has substantially lower signal-to-noise ratio, which necessitates robust voice activity detection (VAD). This paper highlights the characteristics of interview speech files in NIST SREs and discusses the difficulties in performing speech/nonspeech segmentat...

Publication date: 2015